Submitted to Eurospeech’99, Budapest

MULTI-STREAM SPEECH RECOGNITION: READY FOR PRIME TIME?

Authors

  • Adam Janin
  • Dan Ellis
  • Nelson Morgan

Abstract

Multi-stream and multi-band methods can improve the accuracy of speech recognition systems without greatly increasing their complexity. However, they cannot be applied blindly. In this paper, we review our experience applying multi-stream and multi-band methods to the Broadcast News corpus. We found that multi-stream systems using different acoustic front-ends provide a significant improvement over single-stream systems. However, we have not yet been able to show any gain using multi-band methods, even though they have been successful on smaller tasks. We report various insights gained from applying these methods to a large-vocabulary task.

Similar resources

Speech/music discrimination based on posterior probability features (submitted to Eurospeech’99, Budapest)

A hybrid connectionist-HMM speech recognizer uses a neural network acoustic classifier. This network estimates the posterior probability that the acoustic feature vectors at the current time step should be labelled as each of around 50 phone classes. We sought to exploit informal observations of the distinctions in this posterior domain between nonspeech audio and speech segments well-modeled b...

Is ASR ready for wireless primetime: Measuring the core technology for selected applications

It is estimated that by the end of 2001 as many as 500 million people worldwide will use cellular services. The nature of hands-busy and eyes-busy situations inherent in the anywhere and anytime wireless communication paradigm presents exciting marketing opportunities and, at the same time, unique technical challenges to the current-generation ASR technology and their new applications. Current ...

Multimedia interaction for the new millennium

Spoken language processing has created value in multiple application areas such as document transcription, data base entry, and command and control. Recently scientists have been focusing on a new class of application that promises on-demand access to multimedia information such as radio and broadcast news. In separate research, augmenting traditional graphical interfaces with additional modali...

Using multiple time scales in a multi-stream speech recognition system

In this paper we propose and investigate a new approach towards using multiple time scale information in automatic speech recognition (ASR) systems. In this framework we are using a particular HMM formalism able to process different input streams and to recombine them at some temporal anchor points. While the phonological level of recombination has to be defined a priori, the optimal temporal ancho...

Publication date: 1999